Abstract. According to the Furstenberg-Zimmer structure theorem, every
measure-preserving system has a maximal distal factor, and is weak mixing
relative to that factor. Furstenberg and Katznelson used this structural
analysis of measure-preserving systems to provide a perspicuous proof of
Szemerédi's theorem. Beleznay and Foreman showed that, in general, the
transfinite construction of the maximal distal factor of a separable
measure-preserving system can extend arbitrarily far into the countable
ordinals. Here we show that the Furstenberg-Katznelson proof does not require
the full strength of the maximal distal factor, in the sense that the proof
only depends on a combinatorial weakening of its properties. We show that this
combinatorially weaker property obtains fairly low in the transfinite
construction, namely, by the $\omega^{\omega^\omega}$th level.



1. Introduction 

Let $\mathcal{X} = (X, \mathcal{B}, \mu, T)$ be a measure preserving system,
that is, a finite measure space $(X, \mathcal{B}, \mu)$ together with a
measure-preserving transformation, $T$. A ($T$-invariant) factor $\mathcal{Y}$
of such a system is said to be distal if it is the last element of an
increasing finite or transfinite sequence
$(\mathcal{Y}_\alpha)_{\alpha \le \theta}$ of factors, such that
$\mathcal{Y}_0$ is the trivial factor, for each $\alpha < \theta$,
$\mathcal{Y}_{\alpha+1}$ is compact relative to $\mathcal{Y}_\alpha$, and for
each limit ordinal $\gamma \le \theta$, $\mathcal{Y}_\gamma$ is the limit of the
preceding factors. A structural analysis due to Furstenberg and Zimmer,
independently, shows that every measure preserving system has a maximal distal
factor, and is weak mixing relative to that factor (see [?]).

Furstenberg [6] proceeded to give an ergodic-theoretic proof of Szemerédi's
theorem that used only a finite sequence of compact extensions of the trivial
factor. But he noted, in passing, that one could give an alternate proof using
the maximal distal factor. Furstenberg and Katznelson [?, 7] in fact used this
strategy to prove a multidimensional generalization of Szemerédi's theorem.
Even for the original version of the theorem, the Furstenberg-Katznelson proof
(which draws on ideas from Ornstein, and is presented in [9]) is perhaps the
cleanest and most perspicuous proof of Szemerédi's theorem to date.

Beleznay and Foreman [5] have shown that for the separable spaces that arise
in the proofs of Szemerédi's theorem, the transfinite construction of the
maximal distal factor can extend arbitrarily far into the countable ordinals.
It is therefore striking that the proof of a finitary combinatorial result can
make use of such a transfinite construction in an essential way.

Our goal here is to provide a precise sense in which the Furstenberg-Katznelson
proof does not "need" the full transfinite hierarchy. Specifically, we show
that the argument does not require that $\mathcal{X}$ is weak mixing relative
to a distal factor $\mathcal{Y}$; rather, it is enough to know that
$\mathcal{Y}$ is a limit of distal factors with respect to which $\mathcal{X}$
exhibits sufficient approximations to weak mixing behavior. We show that such
distal factors always occur fairly low down in the transfinite hierarchy, in
fact, by the $\omega^{\omega^\omega}$th level. This helps clarify the
combinatorial role of the maximal distal factor in the Furstenberg-Katznelson
argument, and the axiomatic strength needed to carry out the proof.

A central theme here is that if instead of exact limits one is interested in
having only sufficiently large pockets of approximate stability, one can often
obtain better bounds, uniformity, and/or computability results. We referred to
this phenomenon as "local stability" in [3]; Tao [19, 20] has used the term
"metastability" in a similar sense. In particular, we will rely on a
metastability analysis of the mean ergodic theorem due to Kohlenbach and
Leuştean [13].

The outline of this paper is as follows. In Section 2 we briefly outline the
Furstenberg-Katznelson proof of Szemerédi's theorem, introducing the relevant
definitions. In Section 3 we state our main results, which are then proved in
Sections 4 to 6. In Section 7 we describe the logical methods that underlie our
work, and draw conclusions about the axiomatic strength of the principles
needed in the Furstenberg-Katznelson proof.

We are very grateful to our anonymous referees for comments, suggestions, and
corrections, and to Ulrich Kohlenbach for helping us simplify the proofs in
Section 4.

2. Preliminaries 

Szemerédi's Theorem states that for every $k$ and $\delta > 0$ there is an $N$
large enough so that if $S$ is any subset of $\{1, 2, \ldots, N\}$ with density
at least $\delta$, then $S$ contains an arithmetic progression of length $k$.
Furstenberg [6] showed that this is equivalent to the statement that for every
measure preserving system $\mathcal{X}$, every $k$, and every set $A$ of
positive measure, there is an $n \ge 1$ such that
$\mu(\bigcap_{l < k} T^{-ln} A) > 0$. We will henceforth refer to this
measure-theoretic equivalent as Szemerédi's theorem.
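To make the finitary statement concrete, here is a minimal brute-force sketch
in Python. The function names `has_arithmetic_progression` and
`max_progression_free_size` are illustrative and do not come from the paper;
the search is exponential in $N$ and is only meant for very small cases.

```python
from itertools import combinations


def has_arithmetic_progression(subset, k):
    """Return True if `subset` contains a, a+d, ..., a+(k-1)d for some d >= 1."""
    members = set(subset)
    if k <= 1:
        # A progression of length 0 always exists; length 1 needs one element.
        return len(members) >= k
    top = max(members, default=0)
    for a in members:
        d = 1
        # Only try common differences that keep the last term within range.
        while a + (k - 1) * d <= top:
            if all(a + j * d in members for j in range(k)):
                return True
            d += 1
    return False


def max_progression_free_size(n, k):
    """Largest size of a subset of {1, ..., n} with no k-term progression."""
    universe = range(1, n + 1)
    for size in range(n, -1, -1):
        for candidate in combinations(universe, size):
            if not has_arithmetic_progression(candidate, k):
                return size
    return 0
```

For example, `max_progression_free_size(5, 3)` evaluates to 4, witnessed by
the set {1, 2, 4, 5}. Szemerédi's theorem says that, for each fixed $k$, the
ratio of this quantity to $N$ tends to 0 as $N$ grows.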

The $T$-invariant factors of a measure-preserving system
$(X, \mathcal{B}, \mu, T)$ are naturally identified with the
sub-$\sigma$-algebras $\mathcal{B}'$ of $\mathcal{B}$ that are closed under the
map $A \mapsto T^{-1}A$. It is fruitful to adopt a Hilbert-space perspective,
and consider the space $L^2(\mathcal{X})$ of square integrable functions on
$X$, with the isometry $\hat{T}$ which maps $f$ to $f \circ T$. Any
$T$-invariant factor gives rise to the $\hat{T}$-invariant subspace
$\mathcal{Y}$ of $\mathcal{B}'$-measurable functions of $L^2(\mathcal{X})$.
This space contains all the constant functions, and is closed under the map
$f \mapsto \max(f, 0)$. Conversely, any such space gives rise to a
corresponding factor. We will henceforth use $T$ instead of $\hat{T}$ to denote
the relevant isometry on $L^2(\mathcal{X})$, and use the term "factor of
$\mathcal{X}$" to mean a $T$-invariant subspace of $L^2(\mathcal{X})$
containing the constant functions and closed under the map
$f \mapsto \max(f, 0)$. If $A$ is an element of $\mathcal{B}$, "$A$ in
$\mathcal{Y}$" means that the characteristic function $\chi_A$ of $A$ is in
$\mathcal{Y}$, which amounts to saying that $A$ is in the corresponding
$\sigma$-algebra.

If $\mathcal{Y}$ is a factor of $\mathcal{X}$, the expectation operator
$E(f \mid \mathcal{Y})$ denotes the projection of $f$ onto $\mathcal{Y}$. More
information about factors and the expectation operator can be found, say, in
[7]. For the most part, we will be able to restrict our attention to the subset
$L^\infty(\mathcal{X})$ of essentially bounded elements of $L^2(\mathcal{X})$,
and we will use $L^\infty(\mathcal{Y})$ to denote the essentially bounded
elements of the factor $\mathcal{Y}$.

The Furstenberg-Zimmer structure theorem shows that any measure-preserving
system $\mathcal{X}$ has a maximal distal factor, that is, a factor
$\mathcal{Y}$ that is built up using a transfinite sequence of compact
extensions; and that $\mathcal{X}$ is weak mixing relative to $\mathcal{Y}$. We
now briefly review the definitions and provide a more precise statement of the
theorem.

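Definition 2.1. If $\mathcal{Y}$ is a factor of $\mathcal{X}$, we say
$\mathcal{X}$ is weak mixing relative to $\mathcal{Y}$ if for every $f$ and $g$
in $L^\infty(\mathcal{X})$,

\[
  \lim_{n \to \infty} \frac{1}{n} \sum_{i < n} \int
  \big[ E(f \, T^i g \mid \mathcal{Y})
        - E(f \mid \mathcal{Y}) \, E(T^i g \mid \mathcal{Y}) \big]^2 \, d\mu = 0.
\]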
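As a sanity check on Definition 2.1, it may help to specialize to the trivial
factor, which consists of the constant functions. The following is a sketch
under the assumption that $\mu$ is normalized to be a probability measure (a
normalization not stated explicitly above), so that projecting onto the
constants amounts to integrating:

```latex
% Assumption: \mu(X) = 1, and \mathcal{Y} is the trivial factor (constants only).
E(f \mid \mathcal{Y}) = \int f \, d\mu, \qquad
E(f \, T^i g \mid \mathcal{Y}) = \int f \, T^i g \, d\mu, \qquad
E(T^i g \mid \mathcal{Y}) = \int T^i g \, d\mu = \int g \, d\mu,
% so the integrand in Definition 2.1 is constant and the condition reduces to
\lim_{n \to \infty} \frac{1}{n} \sum_{i < n}
  \Big( \int f \, T^i g \, d\mu - \int f \, d\mu \int g \, d\mu \Big)^2 = 0 .
```

In this special case the definition is the usual (absolute) notion of weak
mixing.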

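The following lemma presents two important consequences of relative weak
mixing. The first provides a sense in which weak mixing extensions are also
"weak mixing of all orders." The second shows that if $\mathcal{X}$ is weak
mixing relative to $\mathcal{Y}$, then $\mathcal{Y}$ is "characteristic" for
the averages of the form
$\frac{1}{n} \sum_{i < n} \prod_{l < k} T^{li} f_l$, in the sense that only the
projections of $f_0, \ldots, f_{k-1}$ on $\mathcal{Y}$ bear on the limiting
behavior.

Lemma 2.2. Suppose $\mathcal{X}$ is weak mixing relative to $\mathcal{Y}$. Then
for every $k$ and for all functions $f_0, \ldots, f_{k-1}$ in
$L^\infty(\mathcal{X})$, the following hold:

\[
  \lim_{n \to \infty} \frac{1}{n} \sum_{i < n} \int
  \Big[ E\Big( \prod_{l < k} T^{li} f_l \,\Big|\, \mathcal{Y} \Big)
        - \prod_{l < k} T^{li} E(f_l \mid \mathcal{Y}) \Big]^2 \, d\mu = 0
\]

and

\[
  \lim_{n \to \infty} \Big\| \frac{1}{n} \sum_{i < n}
  \Big( \prod_{l < k} T^{li} f_l
        - \prod_{l < k} T^{li} E(f_l \mid \mathcal{Y}) \Big) \Big\| = 0.
\]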

Given a factor $\mathcal{Y}$, write $\langle f, g \rangle_y$ for
$E(fg \mid \mathcal{Y})(y)$; this provides a "bundle" of Hilbert spaces indexed
by elements $y$ of $X$ (defined up to almost everywhere equivalence). A
function $f$ in $L^2(\mathcal{X})$ is said to be almost periodic relative to
$\mathcal{Y}$ if for every $\delta > 0$, there is a finite set of functions
$g_0, \ldots, g_k$ in $L^2(\mathcal{X})$ such that
$\min_{i \le k} \|f - g_i\|_y < \delta$ for almost every $y$ in $X$. Another
factor $\mathcal{Z} \supseteq \mathcal{Y}$ is said to be a compact extension of
$\mathcal{Y}$ if every element of $\mathcal{Z}$ is a limit of functions that
are almost periodic relative to $\mathcal{Y}$. The space $Z(\mathcal{Y})$
spanned by the functions that are almost periodic relative to $\mathcal{Y}$ is
called the maximal compact extension of $\mathcal{Y}$.

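Lemma 2.3 below provides another characterization of $Z(\mathcal{Y})$. Given
$\mathcal{X}$ and a factor $\mathcal{Y}$, the square of $\mathcal{X}$ relative
to $\mathcal{Y}$, $\mathcal{X} \times_{\mathcal{Y}} \mathcal{X}$, is defined in
[3, ?, ?]. Here we only need the following characterization of the Hilbert
space $L^2(\mathcal{X} \times_{\mathcal{Y}} \mathcal{X})$. Start with formal
elements consisting of sums $\sum_{i < n} f_i \otimes g_i$, where $f_i$ and
$g_i$ are elements of $L^\infty(\mathcal{X})$. Define an inner product on these
elements by taking

\[
  \langle f \otimes g, h \otimes k \rangle_{\mathcal{Y}}
  = \langle E(fh \mid \mathcal{Y}), E(gk \mid \mathcal{Y}) \rangle,
\]

where the right-hand side refers to the usual inner product on
$L^2(\mathcal{X})$, and extending to finite sums using bilinearity. Then
$L^2(\mathcal{X} \times_{\mathcal{Y}} \mathcal{X})$ is, up to isomorphism, the
completion of this space under the associated norm. One can show that for any
$h$ in $L^\infty(\mathcal{Y})$, the elements $hf \otimes g$ and $f \otimes hg$
are identified by the norm, and so one can view $L^\infty(\mathcal{Y})$ as
embedded in $L^2(\mathcal{X} \times_{\mathcal{Y}} \mathcal{X})$ via the map
$h \mapsto h \otimes 1$; in particular, the real numbers are embedded as
elements $c \otimes 1$. The projection of an element $f \otimes g$ on
$\mathcal{Y}$ is then given by

\[
  E(f \otimes g \mid \mathcal{Y}) = E(f \mid \mathcal{Y}) \, E(g \mid \mathcal{Y}).
\]

The action of $T$ on $L^2(\mathcal{X} \times_{\mathcal{Y}} \mathcal{X})$ is
obtained by taking $T(f \otimes g) = Tf \otimes Tg$ and extending it to the
rest of the space.

One can define multiplication by an element $f \otimes g$ by setting
$(f \otimes g) \cdot (h \otimes k) = (fh \otimes gk)$. Integration in
$L^2(\mathcal{X} \times_{\mathcal{Y}} \mathcal{X})$ is given by

\[
  \int f \, d(\mu \times_{\mathcal{Y}} \mu) = \langle f, 1 \rangle.
\]

In particular, if $h$ is in $L^\infty(\mathcal{Y})$,

\[
  \int h \, d(\mu \times_{\mathcal{Y}} \mu) = \int h \, d\mu.
\]

There is also a lattice structure on
$L^2(\mathcal{X} \times_{\mathcal{Y}} \mathcal{X})$ derived from that on
$L^2(\mathcal{X})$; all we will need below is that if $f$ and $g$ are elements
of $L^\infty(\mathcal{X})$, then
$\|f \otimes g\|_{L^\infty(\mathcal{X} \times_{\mathcal{Y}} \mathcal{X})}
\le \|f\|_{L^\infty(\mathcal{X})} \cdot \|g\|_{L^\infty(\mathcal{X})}$.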

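If $H$ is any element of $L^\infty(\mathcal{X} \times_{\mathcal{Y}} \mathcal{X})$
of the form $\sum_{i < n} h_i \otimes g_i$ and $f$ is in $L^2(\mathcal{X})$,
define

\[
  H *_{\mathcal{Y}} f = \sum_{i < n} E(f h_i \mid \mathcal{Y}) \, g_i.
\]

The $*_{\mathcal{Y}}$ operation then extends to arbitrary elements of
$L^2(\mathcal{X} \times_{\mathcal{Y}} \mathcal{X})$ by taking limits. For any
$H$ in $L^\infty(\mathcal{X} \times_{\mathcal{Y}} \mathcal{X})$, the operation
$f \mapsto H *_{\mathcal{Y}} f$ is a bounded linear operator, with
$\|H *_{\mathcal{Y}} f\|_{L^2(\mathcal{X})}
\le \|H\|_\infty \cdot \|f\|_{L^2(\mathcal{X})}$ (see, for example,
[7, pages 130-131]).

We will be particularly interested in elements of
$L^\infty(\mathcal{X} \times_{\mathcal{Y}} \mathcal{X})$ of the form

\[
  H_g^n = \frac{1}{n} \sum_{i < n} T^i (g \otimes g),
\]

where $g$ is in $L^\infty(\mathcal{X})$. The mean ergodic theorem implies that
the functions $H_g^n$ converge to a limit, $H_g$, in
$L^2(\mathcal{X} \times_{\mathcal{Y}} \mathcal{X})$. For each $n$,
$\|H_g^n\|_\infty$, and hence $\|H_g\|_\infty$, is bounded by $\|g\|_\infty^2$.
One can show, moreover, that for any fixed $g$, the sequence
$(H_g^n *_{\mathcal{Y}} f)$ has a rate of convergence that depends only on a
bound on $\|f\|_\infty$. We will make use of this uniformity in Section 5.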

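The following fact is established in [?, ?, 7], and implicitly in [9]:

Lemma 2.3. $Z(\mathcal{Y})$ is the space spanned by the set of elements of the
form $H_g *_{\mathcal{Y}} f$, as $f$ and $g$ range over $L^\infty(\mathcal{X})$.

Moreover, if $\mathcal{X}$ is not weak mixing relative to $\mathcal{Y}$, then
there are elements $H_g *_{\mathcal{Y}} f$ not in $\mathcal{Y}$. Hence:

Lemma 2.4. If $\mathcal{X}$ is not weak mixing relative to $\mathcal{Y}$, then
$Z(\mathcal{Y}) \supsetneq \mathcal{Y}$.

Now define $\mathcal{Y}_0$ to be the trivial factor, consisting of the constant
functions. By transfinite recursion, define
$\mathcal{Y}_{\alpha+1} = Z(\mathcal{Y}_\alpha)$ for every $\alpha$, and define
$\mathcal{Y}_\lambda$ to be the factor spanned by
$\bigcup_{\gamma < \lambda} \mathcal{Y}_\gamma$ for every limit ordinal
$\lambda$. Since $L^2(\mathcal{X})$ is separable, we have
$\mathcal{Y}_{\alpha+1} = Z(\mathcal{Y}_\alpha) = \mathcal{Y}_\alpha$ at some
countable ordinal $\alpha$. By Lemma 2.4, $\mathcal{X}$ is weak mixing relative
to $\mathcal{Y}_\alpha$. The factor $\mathcal{Y} = \mathcal{Y}_\alpha$ is
called the maximal distal factor.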

Definition 2.5. Say that the factor $\mathcal{Y}$ is SZ if for every $k$ and
$A$ in $\mathcal{Y}$ with $\mu(A) > 0$,

\[
  \liminf_{n \to \infty} \frac{1}{n} \sum_{i < n}
  \mu\Big( \bigcap_{l < k} T^{-li} A \Big) > 0.
\]

In particular, Szemerédi's theorem follows from the statement "$\mathcal{X}$ is
SZ." In [9], this is proved as follows:

• The trivial factor is SZ.

• If a factor $\mathcal{Z}$ is SZ, so is $Z(\mathcal{Z})$.

• If each of a sequence $\mathcal{Z}_0, \mathcal{Z}_1, \mathcal{Z}_2, \ldots$ of
factors is SZ, then so is the factor spanned by $\bigcup_i \mathcal{Z}_i$.

• If a factor $\mathcal{Z}$ is SZ, and $\mathcal{X}$ is weak mixing relative to
$\mathcal{Z}$, then $\mathcal{X}$ is SZ.

The first three clauses imply that the maximal distal factor, $\mathcal{Y}$, is
SZ. The last implies that $\mathcal{X}$ is SZ, as required.



3. Main results

The set of countable ordinals can be given a quick inductive definition: $0$ is
a countable ordinal; if $\alpha$ is a countable ordinal, then so is
$\alpha + 1$; and if $\alpha_0, \alpha_1, \alpha_2, \ldots$ is an increasing
sequence of countable ordinals, so is their least upper bound, which we will
denote $\sup_n \alpha_n$. Addition, multiplication, and exponentiation can be
defined recursively (see, for example, [15]), and $\omega$ is defined to be
$\sup_n n$.

It is common to identify each ordinal $\alpha$ with the set
$\{\beta \mid \beta < \alpha\}$ of ordinals less than it. The ordinals serve as
representatives of the order types of well-founded linear orderings, which is
to say, if $(X, \prec)$ is any well-founded linear ordering, then $(X, \prec)$
is isomorphic to $(\alpha, <)$ for some ordinal $\alpha$. The arithmetic
operations then have natural combinatorial interpretations. The ordinal
$\omega$ represents the order type of the natural numbers, and $\alpha + 1$
represents the order type obtained by appending a single element to an ordering
of type $\alpha$. The ordinal $\alpha + \beta$ represents an ordering of type
$\alpha$ followed by an order of type $\beta$. The ordinal $\alpha \cdot \beta$
represents $\beta$ copies of an order of type $\alpha$, that is, the order type
of $\beta \times \alpha$ under lexicographic order. The interpretation of the
ordinal $\alpha^\beta$ is slightly more complicated: it represents the set of
functions from $\beta$ to $\alpha$ that are nonzero at only finitely many
arguments, where the order is obtained by comparing the values at the largest
input where they differ. Of course, for natural numbers $n$, $\alpha^n$ can be
identified with the $n$-fold product of $\alpha$ with itself. Many familiar
properties of addition, multiplication, and exponentiation on the natural
numbers hold for the extensions to the ordinals, but not all. For example,
addition and multiplication are associative but not commutative, since
$1 + \omega = \omega$ and $2 \cdot \omega = \omega$, whereas $\omega + 1$ and
$\omega \cdot 2$ are strictly larger than $\omega$.

Our main theorem is that an approximation to the first property of the maximal
distal factor given in Lemma 2.2 holds fairly low down in the
Furstenberg-Zimmer tower.

Theorem 3.1. For every $k$, all functions $f_0, \ldots, f_{k-1}$ in
$L^\infty(\mathcal{X})$, and every $\varepsilon > 0$, there are $n$ and
$\alpha < \omega^{\omega^\omega}$ such that for every $m \ge n$,

\[
  \frac{1}{m} \sum_{i < m} \int
  \Big[ E\Big( \prod_{l < k} T^{li} f_l \,\Big|\, \mathcal{Y}_\alpha \Big)
        - \prod_{l < k} T^{li} E(f_l \mid \mathcal{Y}_\alpha) \Big]^2 \, d\mu
  < \varepsilon.
\]

In fact, our Lemma 6.8 proves something stronger, namely that given
$f_0, \ldots, f_{k-1}$ and $\varepsilon > 0$ there is an $n$ with "many" such
$\alpha < \omega^{\omega^\omega}$, in an appropriate combinatorial sense. We
obtain the following as a consequence of this stronger fact:

Corollary 3.2. For every $k$, all functions $f_0, \ldots, f_{k-1}$ in
$L^\infty(\mathcal{X})$, and every $\varepsilon > 0$, there are $n$ and
$\alpha < \omega^{\omega^\omega}$ such that for every $m \ge n$,

\[
  \Big\| \frac{1}{m} \sum_{i < m}
  \Big( \prod_{l < k} T^{li} f_l
        - \prod_{l < k} T^{li} E(f_l \mid \mathcal{Y}_\alpha) \Big) \Big\|
  < \varepsilon.
\]



We emphasize that although Theorem 3.1 is new, Corollary 3.2 is not: using an
altogether different argument, Furstenberg [6] showed that for each $k$, a
factor at a finite level of the tower is characteristic for the averages with
$k$-fold products. Our methods are quite general, however, and work in other
situations involving transfinite constructions of factors; see [21]. Moreover,
our argument provides some insight into the role of the maximal distal factor
in the Furstenberg-Katznelson argument, providing a general explanation as to
why the full strength of the construction is not needed to obtain the
combinatorial result.
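Because the bounds in the results above are stated in terms of ordinal
arithmetic, the failure of commutativity is worth seeing concretely. The
following Python sketch handles only addition, and only for ordinals below
$\omega^\omega$, which it represents in Cantor normal form as a list of
`(exponent, coefficient)` pairs with strictly decreasing natural-number
exponents. This representation and the name `ordinal_add` are illustrative
choices, not notation from the paper.

```python
def ordinal_add(a, b):
    """Add two ordinals below omega^omega given in Cantor normal form.

    Each ordinal is a list of (exponent, coefficient) pairs with strictly
    decreasing exponents and positive coefficients; [] represents 0.
    """
    if not b:
        return list(a)
    lead_exp, lead_coeff = b[0]
    # Terms of `a` below the leading term of `b` are absorbed by it.
    kept = [(e, c) for (e, c) in a if e > lead_exp]
    # A term of `a` with the same exponent merges with the leading term of `b`.
    same = sum(c for (e, c) in a if e == lead_exp)
    return kept + [(lead_exp, same + lead_coeff)] + list(b[1:])


ONE = [(0, 1)]    # the ordinal 1
OMEGA = [(1, 1)]  # the ordinal omega
```

Here `ordinal_add(ONE, OMEGA)` returns the representation of $\omega$, while
`ordinal_add(OMEGA, ONE)` returns that of $\omega + 1$, matching the fact that
$1 + \omega = \omega \ne \omega + 1$.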

It is worth noting that for $k = 2$, Theorem 3.1 describes a weaker version of
relative weak mixing. In that case, the discussion at the end of Section 5
shows that the theorem holds with $\omega$ in place of
$\omega^{\omega^\omega}$. It is not hard to show that here $\omega$ cannot be
replaced by any finite ordinal $K$. Otherwise, fixing $f_0 = f_1 = f$, we would
have that for every $\varepsilon > 0$ there is an $\alpha < K$ such that the
conclusion of the theorem holds. By the pigeonhole principle, this would imply
that there is a single $\alpha < K$ that works for every $\varepsilon$, which
is to say, $f$ is weak mixing relative to $\mathcal{Y}_\alpha$. But, by the
results of Beleznay and Foreman [5], there are measure preserving systems with
functions $f$ that are not weak mixing relative to any finite level of the
Furstenberg-Zimmer hierarchy. So, for such functions, the least $\alpha$
satisfying the conclusion of Theorem 3.1 must approach $\omega$ as
$\varepsilon$ approaches 0. Our proof gives an explicit bound on $\alpha$
depending on $k$ and $\varepsilon$; we do not know the extent to which that
bound is sharp.

For $k > 2$, the statement of Lemma 6.8 gives slightly more information, in
terms of a bound less than $\omega^{\omega^\omega}$ depending on $k$. But, once
again, we do not know the extent to which this bound is sharp, nor even that a
bound of $\omega$ itself is insufficient.

Note that our corollary is even weaker than saying that some
$\mathcal{Y}_\alpha$, with $\alpha < \omega^{\omega^\omega}$, is characteristic
for the limit in question. But, as we now show, once we know that
$\mathcal{Y}_\alpha$ is SZ for each $\alpha$ less than or equal to
$\omega^{\omega^\omega}$, this strictly weaker property is sufficient to obtain
Szemerédi's theorem. In fact, the proof is only a slight modification of the
usual Furstenberg-Katznelson argument, e.g. [9, Theorem 8.3].

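Theorem 3.3. $\mathcal{X}$ is SZ.

Proof. Suppose we are given a set $A$ in $\mathcal{B}$ such that $\mu(A) > 0$.
Since

\[
  \frac{1}{n} \sum_{i < n} \mu\Big( \bigcap_{l=0}^{k-1} T^{-li} A \Big)
  = \frac{1}{n} \sum_{i < n} \int \prod_{l < k} T^{li} \chi_A \, d\mu,
\]

our goal is to show that the right-hand side is bounded below by a positive
constant for sufficiently large $n$.

For each $j$, let $\alpha_j$ be the least ordinal such that for sufficiently
large $n$,

\[
  \Big\| \frac{1}{n} \sum_{i < n}
  \Big( \prod_{l=0}^{k-1} T^{li} \chi_A
        - \prod_{l=0}^{k-1} T^{li} E(\chi_A \mid \mathcal{Y}_{\alpha_j}) \Big)
  \Big\| < \frac{1}{j}.
\]

By Corollary 3.2, each $\alpha_j$ is less than $\omega^{\omega^\omega}$. Set
$\alpha = \sup_j \alpha_j \le \omega^{\omega^\omega}$, so that
$\mathcal{Y}_\alpha$ is the factor spanned by
$\bigcup_j \mathcal{Y}_{\alpha_j}$.

Since $\chi_A$ is nonnegative, so is $E(\chi_A \mid \mathcal{Y}_\alpha)$. Let

\[
  B = \{ x \mid E(\chi_A \mid \mathcal{Y}_\alpha)(x) \ge \mu(A)/2 \}.
\]

Since

\[
  \mu(A) = \int_B E(\chi_A \mid \mathcal{Y}_\alpha) \, d\mu
         + \int_{X \setminus B} E(\chi_A \mid \mathcal{Y}_\alpha) \, d\mu
         \le \mu(B) + \mu(A)/2,
\]

it follows that $\mu(B) \ge \mu(A)/2$. Since $\mathcal{Y}_\alpha$ is SZ, there
is a $\delta > 0$ such that

\[
  \frac{1}{n} \sum_{i < n} \mu\Big( \bigcap_{l=0}^{k-1} T^{-li} B \Big) > \delta
\]

whenever $n$ is sufficiently large.

For each $j$, set

\[
  B_j = \{ x \in B \mid E(\chi_A \mid \mathcal{Y}_{\alpha_j})(x) \ge \mu(A)/4 \}.
\]

Since $\mathcal{Y}_\alpha$ is the limit of the factors
$\mathcal{Y}_{\alpha_j}$, we can make $\mu(B - B_j)$ as small as we want by
making $j$ sufficiently large. We will choose $j$ large enough so that
$\mu(B - B_j) < \delta/(2k)$, so that for any $i$ we have

\[
  \mu\Big( \bigcap_{l < k} T^{-li} B_j \Big)
  \ge \mu\Big( \bigcap_{l < k} T^{-li} B \Big) - k \cdot (\delta/(2k))
  = \mu\Big( \bigcap_{l < k} T^{-li} B \Big) - \delta/2.
\]

Then, since
$E(\chi_A \mid \mathcal{Y}_{\alpha_j}) \ge \frac{\mu(A)}{4} \chi_{B_j}$, we
will have

\[
\begin{aligned}
  \frac{1}{n} \sum_{i < n} \int \prod_{l < k}
    T^{li} E(\chi_A \mid \mathcal{Y}_{\alpha_j}) \, d\mu
  &\ge \frac{\mu(A)^k}{4^k} \cdot \frac{1}{n} \sum_{i < n} \int
       \prod_{l < k} T^{li} \chi_{B_j} \, d\mu \\
  &= \frac{\mu(A)^k}{4^k} \cdot \frac{1}{n} \sum_{i < n}
       \mu\Big( \bigcap_{l < k} T^{-li} B_j \Big) \\
  &\ge \frac{\mu(A)^k}{4^k} \cdot \frac{1}{n} \sum_{i < n}
       \Big( \mu\Big( \bigcap_{l < k} T^{-li} B \Big) - \delta/2 \Big) \\
  &> \frac{\mu(A)^k \cdot \delta}{2^{2k+1}}
\end{aligned}
\]

for sufficiently large $n$. Call the right-hand side $\eta$.

Choose $j$ so that in addition to satisfying $\mu(B - B_j) < \delta/(2k)$, we
also have $1/j < \eta/2$. Then, by the construction of the sequence
$(\alpha_j)$, we have

\[
  \frac{1}{n} \sum_{i < n} \int \prod_{l < k} T^{li} \chi_A \, d\mu
  \ge \frac{1}{n} \sum_{i < n} \int \prod_{l < k}
      T^{li} E(\chi_A \mid \mathcal{Y}_{\alpha_j}) \, d\mu - \eta/2
  > \eta/2,
\]

for sufficiently large $n$, as required. □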

We now turn to the proof of Theorem 3.1. Our proof tracks the usual proof that
$\mathcal{X}$ is weak mixing of all orders relative to the maximal distal
factor, $\mathcal{Y}$; but wherever that proof asserts that $\mathcal{X}$
exhibits some behavior relative to $\mathcal{Y}$, we assert instead that
$\mathcal{X}$ exhibits some approximation to that behavior, relative to
sufficiently many $\mathcal{Y}_\alpha$. The following definitions provide the
notions of "sufficiently many" that we will need. If $\theta$ and $\eta$ are
ordinals, $(\theta, \eta]$ denotes the interval
$\{\delta \mid \theta < \delta \le \eta\}$.

Definition 3.4. If $\alpha$ is an ordinal, say $s$ is an $\alpha$-sequence if
$s = (s_\beta)_{\beta \le \alpha}$ is a strictly increasing sequence of
ordinals indexed by ordinals less than or equal to $\alpha$. Say $t$ is a
$\beta$-subsequence of $s$ if $t$ is a $\beta$-sequence and a subsequence of
$s$. If $s$ is an $\alpha$-sequence, the span of $s$, written span($s$), is
$(s_0, s_\alpha]$.

Definition 3.5. If $s$ is an $\alpha$-sequence and $P(\delta)$ is any property,
say $P$ holds for $s$-many $\delta$ if for every $\beta < \alpha$, there is a
$\delta$ in $(s_\beta, s_{\beta+1}]$ such that $P(\delta)$ holds.

In other words, $P(\delta)$ holds for $s$-many $\delta$ if, roughly speaking,
there is an element satisfying $P$ between any two consecutive elements of $s$.
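Definition 3.5 can be illustrated in the simplest case, where all the ordinals
involved are natural numbers. The sketch below, with the hypothetical name
`holds_for_s_many`, treats a finite strictly increasing list `s` of length
$a + 1$ as an $a$-sequence and checks each half-open interval
`(s[b], s[b+1]]` in turn. It says nothing about the transfinite case, which is
the one the paper actually needs.

```python
def holds_for_s_many(s, prop):
    """Check whether `prop` holds for s-many natural numbers.

    `s` is a strictly increasing list of natural numbers, viewed as an
    a-sequence with a = len(s) - 1.  The result is True when every interval
    (s[b], s[b+1]] contains some d with prop(d) true.
    """
    if any(x >= y for x, y in zip(s, s[1:])):
        raise ValueError("s must be strictly increasing")
    return all(
        any(prop(d) for d in range(s[b] + 1, s[b + 1] + 1))
        for b in range(len(s) - 1)
    )
```

With `s = [0, 3, 6, 9]`, the property "is a multiple of 3" holds for $s$-many
numbers, while "is a multiple of 4" does not, since the interval $(0, 3]$
contains no multiple of 4.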

4. Approximating the mean ergodic theorem 

Let $\mathcal{H}$ be any Hilbert space, $T$ an isometry, and $f$ any element of
$\mathcal{H}$. For every $n \ge 1$, let
$A_n f = (1/n) \sum_{i < n} T^i f$. The mean ergodic theorem says that the
sequence $(A_n f)$ converges in the Hilbert space norm; in other words, for
every $\varepsilon > 0$, there is an $n$ such that for every $m \ge n$ we have
$\|A_m f - A_n f\| < \varepsilon$.
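A finite-dimensional example may help fix ideas. The Python sketch below takes
$\mathcal{H} = \mathbb{C}^d$ with $T$ a diagonal unitary matrix (an
illustrative choice, not an example from the paper), computes the averages
$A_n f$, and searches for an $n$ beyond which the averages stay within
$\varepsilon$ of $A_n f$. Since a program can only inspect finitely many $m$,
the check runs up to a cutoff given by the length of the list of averages;
that cutoff is an assumption of the sketch rather than part of the theorem.

```python
import cmath
import math


def ergodic_averages(f, angles, horizon):
    """Return [A_1 f, ..., A_horizon f] for T = diag(exp(1j * angle))."""
    averages = []
    partial = [0j] * len(f)            # running sum of T^i f over i < n
    current = [complex(x) for x in f]  # T^i f, starting with i = 0
    for n in range(1, horizon + 1):
        partial = [p + c for p, c in zip(partial, current)]
        averages.append([p / n for p in partial])
        current = [c * cmath.exp(1j * a) for c, a in zip(current, angles)]
    return averages


def distance(u, v):
    """Euclidean (Hilbert space) distance between two vectors in C^d."""
    return math.sqrt(sum(abs(x - y) ** 2 for x, y in zip(u, v)))


def first_stable_index(averages, eps):
    """Least n with ||A_m f - A_n f|| < eps for all n <= m <= len(averages)."""
    horizon = len(averages)
    for n in range(1, horizon + 1):
        if all(distance(averages[m - 1], averages[n - 1]) < eps
               for m in range(n, horizon + 1)):
            return n
    return horizon  # only reached when eps <= 0 or the list is empty
```

For instance, with `f = [1.0, 1.0]` and `angles = [0.0, 1.0]`, the first
coordinate is fixed by $T$ and the second is rotated, so the averages approach
`[1, 0]`, the projection of $f$ onto the $T$-invariant vectors.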

Now let $(\mathcal{H}_\alpha)_{\alpha \in S}$ be a sequence of Hilbert spaces
indexed by ordinals in some set $S$, let $(T_\alpha)$ be a sequence of
isometries, and let $(f_\alpha)$ be a sequence of elements. Given
$\varepsilon > 0$, the mean ergodic theorem implies that for every $\alpha$
there is an $n$ as above, but, of course, different $\alpha$'s may call for
different $n$'s.

Here we will be concerned with the case where the spaces $\mathcal{H}_\alpha$
are the ones denoted by
$L^2(\mathcal{X} \times_{\mathcal{Y}_\alpha} \mathcal{X})$ in Section 5 and for
some $L^\infty(\mathcal{X})$ function $f$, each $f_\alpha$ is the element
$f \otimes f$ in the corresponding space. Our goal is to obtain for every
$\varepsilon > 0$ a single $n$ that works for sufficiently many $\alpha$'s. In
Section 5 we will use this to show that approximate weak mixing behavior occurs
sufficiently often relative to the factors $\mathcal{Y}_\alpha$.

Our original presentation relied on information extracted in [3] from the proof
of the mean ergodic theorem due to Riesz [16]. We are grateful to Ulrich
Kohlenbach for pointing out that the proofs of the results in this section
could be simplified considerably by using information extracted by Kohlenbach
and Leuştean [13] from a proof of the mean ergodic theorem by Garrett Birkhoff
[10]. The following lemma is implicit in [13], and holds more generally for
nonexpansive mappings on a uniformly convex Banach space. It says, roughly,
that from a bound on $k$ such that $\|A_k f\|$ is close to its infimum, one can
determine a value $n$ beyond which the sequence of ergodic averages is close to
its limit.

Lemma 4.1. For every $B$ and $\varepsilon > 0$ there is a $\gamma > 0$ with the
following property: for every $i$ there is an $n$ such that if $f$ is any
element of a Hilbert space $\mathcal{H}$ with $\|f\| \le B$, $T$ is an
isometry, and there is a $k \le i$ such that

\[
  \|A_k f\| \le \|A_j f\| + \gamma \tag{1}
\]

holds for every $j$, then

\[
  \|A_n f - A_m f\| < \varepsilon
\]

for every $m \ge n$.

Proof. Using the notation of [13], let $M = 16B/\varepsilon$, let $n = Mi$, and
let $\gamma = (\varepsilon/16) \, \eta(\varepsilon/(8B))$, where $\eta$ is a
modulus of convexity for Hilbert space. The proof in [13, Section 4, pages
1913-1914] shows that if (1) holds for every $j$, then
$\|A_m f - A_l f\| < \varepsilon$ holds for every $m$ and $l$ greater than or
equal to $n$. (The $N$ in [13] plays the role of our $i$, and $P$ corresponds
to our $n$. Because we are assuming that (1) holds for all $j$, the conclusion
of the argument in [13] holds for arbitrary functions $g$.) □

We now fix a sequence of Hilbert spaces $(\mathcal{H}_\alpha)_{\alpha \in S}$,
where $S$ is some set of ordinals and each $\mathcal{H}_\alpha$ comes equipped
with its own inner product $\langle \cdot, \cdot \rangle_\alpha$ and norm
$\| \cdot \|_\alpha$. We also fix an isometry $T_\alpha$ on each
$\mathcal{H}_\alpha$. The next theorem deals with sequences
$(f_\alpha)_{\alpha \in S}$, where each $f_\alpha$ is in $\mathcal{H}_\alpha$.
For readability, we will adopt the practice of dropping the subscripted
$\alpha$ on terms like $f_\alpha$ and $T_\alpha$ when the context makes it
clear. Thus, for example, the expression $\|A_n f\|_\alpha$ really means
$\|A_n f_\alpha\|_\alpha$.

Although the sequences $(A_n f)$ converge in each $\mathcal{H}_\alpha$, they
may have very different rates of convergence. The next theorem shows that,
nonetheless, as long as there is a uniform bound on the values
$\|f\|_\alpha$, for any $\varepsilon > 0$ there is always an $n$ large enough
so that, for "many" $\alpha$'s, $\|A_n f - A_m f\| < \varepsilon$ holds for all
$m \ge n$.

Theorem 4.2. Let $\varepsilon > 0$ and $B > 0$. Then there is a natural number
$K$ such that for every $\alpha^K$-sequence $s$ and every sequence of elements
$(f_\delta)_{\delta \in \mathrm{span}(s)}$ bounded by $B$ in norm, there are a
natural number $n$ and an $\alpha$-subsequence $t$ of $s$, such that the
property

\[
  \|A_n f - A_m f\|_\delta < \varepsilon \quad \text{for every } m \ge n
\]

holds for $t$-many $\delta$.

Proof. For each $i$, write
$a_{i,\delta} = \inf_{j \le i} \|A_j f_\delta\|_\delta$. According to the
convention above, we will leave the subscripted $\delta$'s off of $f_\delta$
and $a_{i,\delta}$ but keep the dependence in mind. For each $\delta$, the
sequence $a_i$ is a decreasing sequence bounded above by $B$ and below by 0.
Let $\gamma$ be as guaranteed to exist by Lemma 4.1.

Now let $K = \lceil B/\gamma \rceil + 1$, let $s$ be any $\alpha^K$-sequence,
and let $(f_\delta)_{\delta \in \mathrm{span}(s)}$ be a sequence of elements
bounded by $B$ in norm. It suffices to show that there are a natural number $i$
and an $\alpha$-subsequence $t$ of $s$ such that the property

    for every $j \ge i$, $a_i \le a_j + \gamma$

holds for $t$-many $\delta$, because then the hypotheses of Lemma 4.1, and
hence the conclusion, are satisfied for these $\delta$'s.

Suppose otherwise. Then we have the following (*):

    For every $i$ and $\alpha$-subsequence $t$ of $s$, there are $j > i$ and
    $\beta < \alpha$ such that for every $\delta \in (t_\beta, t_{\beta+1}]$,
    $a_j < a_i - \gamma$.

Start with $i_0 = 0$, in which case $a_{i_0} = \|f\|$. Think of $s$ as
consisting of $\alpha$-many consecutive $\alpha^{K-1}$-subsequences,
overlapping only at the endpoints, so that the last element of one is the first
element of the next. We can then use (*) to find an $i_1 > i_0$ and one of
those subsequences such that $a_{i_1} < a_{i_0} - \gamma$ on its span. Then
think of that subsequence as consisting of $\alpha$-many consecutive
$\alpha^{K-2}$-subsequences, and use (*) again to find an $i_2 > i_1$ and one
of those sequences such that $a_{i_2} < a_{i_1} - \gamma$ on its span.
Continuing in this way we ultimately find a $\delta$ and a sequence
$a_{i_0}, a_{i_1}, \ldots, a_{i_K}$ such that for each $u < K$ we have
$a_{i_{u+1}} < a_{i_u} - \gamma$ at $\delta$. But this contradicts the fact
that, by the choice of $K$, $a_i$ can decrease by $\gamma$ fewer than $K$
times. □
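The counting step at the heart of this proof can be isolated for a single
sequence. The sketch below (the function names `stable_index` and
`jump_bound` are hypothetical) takes a finite nonincreasing list of
nonnegative numbers and repeatedly jumps to the next index where the value has
dropped by more than `gamma`. Since the values stay in $[0, B]$, fewer than
$\lceil B/\gamma \rceil + 1$ such jumps are possible, and the index where the
jumps stop satisfies the stability condition from the proof within the list.
It covers only one sequence, not the family indexed by ordinals.

```python
import math


def stable_index(a, gamma):
    """Find i with a[i] <= a[j] + gamma for all j >= i, by greedy jumps.

    `a` is a finite nonincreasing list of nonnegative numbers.  Returns the
    pair (i, jumps), where `jumps` counts drops of more than `gamma`.
    """
    i = 0
    jumps = 0
    while True:
        # Next index where the value has dropped by more than gamma.
        nxt = next((j for j in range(i + 1, len(a)) if a[j] < a[i] - gamma), None)
        if nxt is None:
            return i, jumps
        i, jumps = nxt, jumps + 1


def jump_bound(bound, gamma):
    """The K of the proof: values in [0, bound] allow fewer than K jumps."""
    return math.ceil(bound / gamma) + 1
```

For the list `a[n] = 1 / (n + 1)` with `gamma = 0.25`, the search jumps from
index 0 to index 1 and then to index 4, and stops there after 2 jumps, well
below `jump_bound(1.0, 0.25)`, which is 5.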

We now specialize to the situation where each $\mathcal{H}_\alpha$ is
$L^2(\mathcal{X} \times_{\mathcal{Y}_\alpha} \mathcal{X})$, and each
$f_\alpha$ is $f \otimes f$, for some fixed $L^\infty(\mathcal{X})$ function
$f$. This meets the requirements of Theorem 4.2, because we have
$\|f \otimes f\|_\alpha^2 = \langle f \otimes f, f \otimes f \rangle_\alpha
= \int E(f^2 \mid \mathcal{Y}_\alpha)^2 \, d\mu \le \|f\|_\infty^4$ for each
$\alpha$. Thus we have a uniform approximate version of the mean ergodic
theorem for $L^2(\mathcal{X} \times_{\mathcal{Y}_\alpha} \mathcal{X})$.

Theorem 4.3. Let $\varepsilon > 0$ and $B > 0$. Then there is a natural number
$K$ such that for every $\alpha^K$-sequence $s$ and every $f$ in
$L^\infty(\mathcal{X})$ with $\|f\|_\infty \le B$, there are a natural number
$n$ and an $\alpha$-subsequence $t$ of $s$, such that the property

    for every $m \ge n$,
    $\|A_n(f \otimes f) - A_m(f \otimes f)\|_\delta < \varepsilon$

holds for $t$-many $\delta$.

Notice that if $s$ is the trivial 1-sequence $\delta, \delta + 1$, Theorem 4.3
simply asserts that $A_n(f \otimes f)$ converges in
$\mathcal{X} \times_{\mathcal{Y}_{\delta+1}} \mathcal{X}$.

5. Approximating weak mixing 

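Let $g$ be in $L^\infty(\mathcal{X})$. Now notice that the elements $H_g^n$ of
the spaces $L^2(\mathcal{X} \times_{\mathcal{Y}_\alpha} \mathcal{X})$, defined
in Section 2, are none other than the elements $A_n(g \otimes g)$, where $A_n$
is as in Section 4. Let $f$ be any element of $L^2(\mathcal{X})$. As we
observed in Section 2, the rate of convergence of
$H_g^n *_{\mathcal{Y}_\alpha} f$ to $H_g *_{\mathcal{Y}_\alpha} f$ depends only
on the rate of convergence of $H_g^n$ to $H_g$ in
$L^2(\mathcal{X} \times_{\mathcal{Y}_\alpha} \mathcal{X})$ and on
$\|f\|_{L^2(\mathcal{X})}$.

We now use this to obtain our first main result, to the effect that
$\mathcal{X}$ exhibits approximate weak mixing behavior relative to the factors
$\mathcal{Y}_\delta$, for sufficiently many ordinals $\delta$.

Theorem 5.1. For every $\varepsilon > 0$ and $B > 0$ there is a natural number
$K$ such that for every $\alpha \ge \omega$, every $\alpha^K$-sequence $s$, and
every $f$ and $g$ with $\|f\|_\infty \le B$ and $\|g\|_\infty \le B$, there are
an $n$ and an $\alpha$-subsequence $t$ of $s$, such that the property

    for every $m \ge n$,
    $\frac{1}{m} \sum_{i < m} \int \big[ E(f \, T^i g \mid \mathcal{Y}_\delta)
      - E(f \mid \mathcal{Y}_\delta) \, T^i E(g \mid \mathcal{Y}_\delta)
      \big]^2 \, d\mu < \varepsilon$

holds for $t$-many $\delta$.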

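Proof. For any $\delta$, if we set $h_\delta$ equal to
$f - E(f \mid \mathcal{Y}_\delta)$, we have

\[
\begin{aligned}
  \frac{1}{m} \sum_{i < m} \int
    & \big[ E(f \, T^i g \mid \mathcal{Y}_\delta)
      - E(f \mid \mathcal{Y}_\delta) \, T^i E(g \mid \mathcal{Y}_\delta)
      \big]^2 \, d\mu \\
  &= \frac{1}{m} \sum_{i < m} \int
     \big[ E(h_\delta T^i g + E(f \mid \mathcal{Y}_\delta) T^i g
             \mid \mathcal{Y}_\delta)
      - E(f \mid \mathcal{Y}_\delta) \, T^i E(g \mid \mathcal{Y}_\delta)
      \big]^2 \, d\mu \\
  &= \frac{1}{m} \sum_{i < m} \int
     \big[ E(h_\delta T^i g \mid \mathcal{Y}_\delta) \big]^2 \, d\mu \\
  &= \frac{1}{m} \sum_{i < m} \int
     E(h_\delta T^i g \mid \mathcal{Y}_\delta) \,
     E(h_\delta T^i g \mid \mathcal{Y}_\delta) \, d\mu \\
  &= \frac{1}{m} \sum_{i < m} \int
     E\big( h_\delta \cdot T^i g \, E(h_\delta T^i g \mid \mathcal{Y}_\delta)
        \mid \mathcal{Y}_\delta \big) \, d\mu \\
  &= \int E\big( h_\delta \cdot (H_g^m *_{\mathcal{Y}_\delta} h_\delta)
        \mid \mathcal{Y}_\delta \big) \, d\mu \\
  &= \int h_\delta \cdot (H_g^m *_{\mathcal{Y}_\delta} h_\delta) \, d\mu.
\end{aligned}
\]

Here is the idea: by Theorem 4.3 we can make
$H_g^m *_{\mathcal{Y}_\delta} h_\delta$ close to
$H_g *_{\mathcal{Y}_\delta} h_\delta$ for sufficiently many $\delta$. By the
definition of the transfinite sequence of factors $(\mathcal{Y}_\delta)$,
$H_g *_{\mathcal{Y}_\delta} h_\delta$ is in $\mathcal{Y}_{\delta+1}$. On the
other hand, $h_{\delta+1}$ is orthogonal to $\mathcal{Y}_{\delta+1}$, so
$\int h_{\delta+1} \cdot (H_g *_{\mathcal{Y}_\delta} h_\delta) \, d\mu$ is
equal to 0. Thus, as long as

\[
  h_\delta - h_{\delta+1}
  = E(f \mid \mathcal{Y}_{\delta+1}) - E(f \mid \mathcal{Y}_\delta)
\]

is small, $\int h_\delta \cdot (H_g^m *_{\mathcal{Y}_\delta} h_\delta) \, d\mu$
will be close to 0, as required.

But now suppose we obtain a countable sequence
$\delta_0 < \delta_1 < \delta_2 < \cdots$ of ordinals, where
$H_g^m *_{\mathcal{Y}_{\delta_i}} h_{\delta_i}$ is close to
$H_g *_{\mathcal{Y}_{\delta_i}} h_{\delta_i}$ for each $i$. Then since
$(E(f \mid \mathcal{Y}_{\delta_i}))_{i \in \mathbb{N}}$ is a sequence of
projections of $f$ onto increasing factors, for some $i$ we will have that
$E(f \mid \mathcal{Y}_{\delta_{i+1}}) - E(f \mid \mathcal{Y}_{\delta_i})$, and
hence $h_{\delta_i} - h_{\delta_i + 1}$, is sufficiently small. Such a
$\delta_i$ is then one of the ordinals we are after.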

The details are as follows. Given $\varepsilon > 0$, apply Theorem 4.3 to
$\varepsilon/2B$, and let $K$ satisfy the conclusion of that theorem. We claim
that $2K$ satisfies the conclusion of Theorem 5.1.

Suppose we are given an $\alpha^{2K}$-sequence $s$, and $f$ and $g$ satisfying
$\|f\|_\infty \le B$ and $\|g\|_\infty \le B$. Since $\alpha \ge \omega$, we
have $\alpha^{2K} = (\alpha^2)^K \ge (\omega \cdot \alpha)^K$, and we can
restrict our attention to the initial $(\omega \cdot \alpha)^K$-subsequence of
$s$. By our choice of $K$, there are an $n$ and an
$\omega \cdot \alpha$-subsequence $t$ such that the property (*)

    for every $m \ge n$ and $h$ with $\|h\|_{L^2(\mathcal{X})} \le B$,
    $\|H_g^m *_{\mathcal{Y}_\delta} h - H_g *_{\mathcal{Y}_\delta} h\|
      < \varepsilon/2B$

holds for $t$-many $\delta$.

Let $t'$ be the $\alpha$-sequence obtained by taking every $\omega$th element
of $t$; that is, define $t'_\beta = t_{\omega \cdot \beta}$ for each
$\beta \le \alpha$. We claim that the property (**)

    for every $m \ge n$,
    $\int h_\delta \cdot (H_g^m *_{\mathcal{Y}_\delta} h_\delta) \, d\mu
      < \varepsilon$

holds for $t'$-many $\delta$, as required.

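To prove this, let $\beta < \alpha$. We need to show that there is a $\delta$
satisfying

\[
  t_{\omega \cdot \beta} = t'_\beta < \delta \le t'_{\beta+1}
  = t_{\omega \cdot (\beta+1)}
\]

such that, for every $m \ge n$,
$\int h_\delta \cdot (H_g^m *_{\mathcal{Y}_\delta} h_\delta) \, d\mu
< \varepsilon$. By our choice of $t$, for every $i$ there is a
$\delta_i \in (t_{\omega \cdot \beta + i}, t_{\omega \cdot \beta + i + 1}]$
satisfying (*) with $\delta_i$ in place of $\delta$. Choose $i$ such that

\[
  \|h_{\delta_i + 1} - h_{\delta_i}\|
  = \|E(f \mid \mathcal{Y}_{\delta_i + 1}) - E(f \mid \mathcal{Y}_{\delta_i})\|
  < \varepsilon/2B^3.
\]

Now for $\delta = \delta_i$, we have

\[
\begin{aligned}
  h_\delta \cdot (H_g^m *_{\mathcal{Y}_\delta} h_\delta)
  ={} & h_\delta \big( (H_g^m *_{\mathcal{Y}_\delta} h_\delta)
        - (H_g *_{\mathcal{Y}_\delta} h_\delta) \big) \\
      & + (h_\delta - h_{\delta+1}) \cdot (H_g *_{\mathcal{Y}_\delta} h_\delta)
        + h_{\delta+1} \cdot (H_g *_{\mathcal{Y}_\delta} h_\delta).
\end{aligned}
\]

For every $m \ge n$, by (*), the first term is bounded in $L^2(\mathcal{X})$
norm by $\|h_\delta\|_\infty \cdot \varepsilon/2B$, which is less than
$\varepsilon/2$, since $\|h_\delta\|_\infty \le B$. The second term is bounded
in $L^2(\mathcal{X})$ norm by
$(\varepsilon/2B^3) \cdot \|H_g *_{\mathcal{Y}_\delta} h_\delta\|_\infty$,
which is less than $\varepsilon/2$, since $\|H_g\|_\infty \le B^2$. The
integral of the last term is 0, since $h_{\delta+1}$ is orthogonal to
$\mathcal{Y}_{\delta+1}$ and $H_g *_{\mathcal{Y}_\delta} h_\delta$ is an
element of $\mathcal{Y}_{\delta+1}$. Hence we have
$\int h_\delta \cdot (H_g^m *_{\mathcal{Y}_\delta} h_\delta) \, d\mu
< \varepsilon$, as required. □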

Notice that, in the previous proof, we did not really need an
$(\omega \cdot \alpha)$-sequence $t$ satisfying (*); an
$(L \cdot \alpha)$-sequence would have been sufficient, for a sufficiently
large natural number $L$. Furthermore, if $\alpha$ is any limit ordinal, then
$L \cdot \alpha = \alpha$. Note also that we could just as well have switched
the two steps of thinning $s$: starting with an $(\alpha^K \cdot L)$-sequence
$s$, we could have obtained an $\alpha^K$-subsequence $t'$ such that
$\|E(f \mid \mathcal{Y}_\gamma) - E(f \mid \mathcal{Y}_\delta)\|
< \varepsilon/2$ for every $\gamma$ and $\delta$ in the span of $t'$, and then
applied Theorem 4.3 to obtain an $\alpha$-subsequence $t$ such that (*) holds
for $t$-many $\delta$. In particular, any sequence of length $L$ is sufficient
to obtain a 1-sequence $t$ such that the conclusion of Theorem 5.1 holds for
$t$-many $\delta$, which is to say, at least one $\delta$ in the span of $t$.
This shows that for $k = 2$, Theorem 3.1 holds with $\omega$ in place of
$\omega^{\omega^\omega}$.

6. Approximating weak mixing of all orders 

In this section, we show how to approximate the property of being weak mixing
of all orders relative to the maximal distal factor below level
$\omega^{\omega^\omega}$ in the Furstenberg-Zimmer tower. Our proof parallels
the proof in [9] that the fact that $\mathcal{X}$ is weak mixing relative to
$\mathcal{Y}$ implies that it is weak mixing of all orders relative to
$\mathcal{Y}$; but wherever that proof asserts that some property holds
relative to $\mathcal{Y}$, we assert that a corresponding property holds
relative to $\mathcal{Y}_\delta$, for sufficiently many $\delta$'s. Unlike the
properties in the previous section, for which sequences of length $\alpha^K$
with integer $K$ were sufficient, we will need to consider sequences of the
length $\alpha^\theta$, where $\theta$ is an ordinal less than $\omega^\omega$.

We start by proving three technical lemmas, which correspond to claims that are
trivial in the original proof, but become more complicated in our modified
version. To give a typical example, if both

\[
  \frac{1}{m} \sum_{i < m} \big\| E(f \, T^i g \mid \mathcal{Y})
    - E(f \mid \mathcal{Y}) \, T^i E(g \mid \mathcal{Y}) \big\| \to 0
\]

and

\[
  \frac{1}{m} \sum_{i < m} \big\| E(f' \, T^i g \mid \mathcal{Y})
    - E(f' \mid \mathcal{Y}) \, T^i E(g \mid \mathcal{Y}) \big\| \to 0,
\]

then

\[
  \frac{1}{m} \sum_{i < m} \big\| E((f + f') \, T^i g \mid \mathcal{Y})
    - E((f + f') \mid \mathcal{Y}) \, T^i E(g \mid \mathcal{Y}) \big\| \to 0,
\]

and such inferences are used many times in the proof in [9]. In our
"approximate" version, however, we typically wish to show that for each
$\varepsilon$ we can find "many" $\delta$ such that the third average is less
than $\varepsilon$ with respect to $\mathcal{Y}_\delta$, using the fact that
the first two averages are small with respect to many $\mathcal{Y}_\delta$. In
particular, this requires finding many $\delta$ such that the first two
averages are small simultaneously at $\mathcal{Y}_\delta$.

Since the same situation recurs during the proof with many different choices of
the precise averages being controlled, we will state the lemmas in a very
general form. We will work with properties $\varphi(\delta)$ which assert that
a quantity computed with respect to $\mathcal{Y}_\delta$ is small; for
instance, in the example above, the first choice of $\varphi(f, m, \delta)$
would be

\[
  \frac{1}{m} \sum_{i < m} \big\| E(f \, T^i g \mid \mathcal{Y}_\delta)
    - E(f \mid \mathcal{Y}_\delta) \, T^i E(g \mid \mathcal{Y}_\delta) \big\|
  < \varepsilon.
\]

We will use the fact that such properties are continuous in the following sense. 

Definition 6.1. A property $\varphi(x, \delta)$ is continuous in $\delta$ if for any choice of values $t$
for $x$ such that $\varphi(t, \delta_i)$ holds for all $i$, also $\varphi(t, \sup_i \delta_i)$ holds.

The first lemma says that we can arrange for a pair of continuous properties to
hold for many $\delta$ simultaneously by arranging for each property, in turn, to hold
sufficiently often.

Lemma 6.2. Suppose $\varphi_1(x, \delta)$ and $\varphi_2(x, \delta)$ are continuous in $\delta$. Fix $x$.
Suppose there is a $\theta_1 < \omega^p$ such that for every $\alpha^{\theta_1}$-sequence $s$ with $\alpha > \omega$ and
every $f$ with $\|f\|_{L^\infty} < B$, there are a natural number $n_1$ and an $\alpha$-subsequence $t$ of
$s$ such that the property

for every $m > n_1$, $\varphi_1(f, m, \delta)$

holds for $t$-many $\delta$.

Suppose that, additionally, there is a $\theta_2 < \omega^q$ such that for every $\alpha^{\theta_2}$-sequence $s$
with $\alpha > \omega$ and every $f$ with $\|f\|_{L^\infty} < B$, there are a natural number $n_2$ and an
$\alpha$-subsequence $t$ of $s$ such that the property

for every $m > n_2$, $\varphi_2(f, m, \delta)$

holds for $t$-many $\delta$.

Then there is a $\theta < \omega^{p+q}$ such that for every $\alpha^\theta$-sequence $s$ with $\alpha > \omega$ and
every $f$ with $\|f\|_{L^\infty} < B$, there are a natural number $n$ and an $\alpha$-subsequence $t$ of
$s$ such that the property

for every $m > n$, $\varphi_1(f, m, \delta)$ and $\varphi_2(f, m, \delta)$

holds for $t$-many $\delta$.

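Proof. Given $\theta_1$ and $\theta_2$ as in the hypotheses, let $\theta = 2 \cdot \theta_1 \cdot \theta_2$. Let $s$ be an
$\alpha^{2 \cdot \theta_1 \cdot \theta_2}$-sequence, and let $f$ be given. Applying the hypotheses sequentially, we
obtain an $\alpha^2$-subsequence $t'$ and an $n = \max(n_1, n_2)$ such that both the properties
$\forall m > n\ \varphi_1(f, m, \delta)$ and $\forall m > n\ \varphi_2(f, m, \delta)$ hold for $t'$-many $\delta$. Since $\alpha > \omega$,
we can consider the $\alpha$-subsequence $t$ of $t'$ given by setting $t_\beta := t'_{\beta \cdot \omega}$ for each
$\beta < \alpha$. For any $\beta < \alpha$ and any $n < \omega$, there is a $\delta$ in $[t'_{\beta \cdot \omega + n}, t'_{\beta \cdot \omega + n + 1})$ such that
$\forall m > n\ \varphi_1(f, m, \delta)$ holds, so ordinals with this property occur unboundedly below
$t_{\beta+1} = t'_{(\beta+1) \cdot \omega}$. In particular, by continuity, $\forall m > n\ \varphi_1(f, m, t'_{(\beta+1) \cdot \omega})$, and
similarly for $\varphi_2$, so the sequence $t$ witnesses the lemma. □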

We will often want to show that a property $\varphi(f, \delta)$ holds for sufficiently many $\delta$
by decomposing $f$ into $E(f \mid \mathcal{Y}_\delta)$ and $f - E(f \mid \mathcal{Y}_\delta)$. We will be able to do this by
finding a long sequence such that $E(f \mid \mathcal{Y}_\delta)$ does not change much over its span,
and then dealing with each value, in turn. The next lemma makes this precise.

Lemma 6.3. Suppose there is a $\theta < \omega^p$ such that for every $\alpha^\theta$-sequence $s$ with
$\alpha > \omega$ and every $f$ with $\|f\|_{L^\infty} < B$, there are a natural number $n$ and an
$\alpha$-subsequence $t$ of $s$ such that the property

for every $m > n$, $\varphi(f, m, \delta)$

holds for $t$-many $\delta$.

Suppose also that $\varepsilon > 0$ is such that whenever $\|f - f'\|_{L^2} < \varepsilon$ and $\varphi(f, m, \delta)$
holds, also $\varphi'(f', m, \delta)$. Let $\varphi$ be continuous in $\delta$. Then there is a $\theta < \omega^{2p+1}$ such
that for every $\alpha^\theta$-sequence $s$ with $\alpha > \omega$ and every $f$ with $\|f\|_{L^\infty} < B$, there are a
natural number $n$ and an $\alpha$-subsequence $t$ of $s$ such that the property

for every $m > n$, $\varphi'(E(f \mid \mathcal{Y}_\delta), m, \delta)$ and $\varphi'(f - E(f \mid \mathcal{Y}_\delta), m, \delta)$

holds for $t$-many $\delta$.

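Proof. Given $\theta$ as in the hypothesis, we claim the conclusion holds of $2\theta^2 + 1$. If $s$
is an $\alpha^{2\theta^2+1}$-sequence, we may use the fact that $\alpha > \omega$ to divide $s$ into $\omega$-many
$\alpha^{2\theta^2}$-sequences given by $s^n_\beta = s_{\alpha^{2\theta^2} \cdot n + \beta}$. Since the norms $\|E(f \mid \mathcal{Y}_\delta)\|_{L^2}$ are
nondecreasing in $\delta$ and bounded, for some $n < \omega$,

\[ \| E(f \mid \mathcal{Y}_{s^n_0}) - E(f \mid \mathcal{Y}_{s^{n+1}_0}) \| < \varepsilon. \]

As in the previous lemma, there is an $\alpha$-subsequence $t$ of $s^n$ such that

for every $m > n$, $\varphi(E(f \mid \mathcal{Y}_{s^n_0}), m, \delta)$ and $\varphi(f - E(f \mid \mathcal{Y}_{s^n_0}), m, \delta)$

holds for $t$-many $\delta$, and the conclusion immediately follows. □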

Our final technical lemma will give us the means to find many $\delta$ where two
properties are satisfied, where the second depends on a parameter that is chosen to
satisfy the first.

Lemma 6.4. Suppose there is a $\theta_0 < \omega^p$ such that for every $\alpha^{\theta_0}$-sequence $s$ with
$\alpha > \omega$ and every $f$ with $\|f\|_{L^\infty} < B$, there are a natural number $n_0$ and an
$\alpha$-subsequence $t$ of $s$ such that the property

for every $m > n_0$, $\varphi_0(f, m, \delta)$

holds for $t$-many $\delta$.

Suppose that, additionally, for every $d$ there is a $\theta_d < \omega^q$ such that for every
$\alpha^{\theta_d}$-sequence $s$ with $\alpha > \omega$ and every $f$ with $\|f\|_{L^\infty} < B$, there is a natural number
$n_d$ and an $\alpha$-subsequence $t$ of $s$ with the property

for every $m > n_d$, $\varphi_d(f, m, \delta)$

holds for $t$-many $\delta$.

If $\varphi_i$ is continuous in $\delta$ for each $i$ then there is a $\theta < \omega^{p+q}$ such that for every
$\alpha^\theta$-sequence $s$ with $\alpha > \omega$ and every $f$ with $\|f\|_{L^\infty} < B$, there are an $n$, an $N$, and
an $\alpha$-subsequence $t$ of $s$ such that the property

$\varphi_0(f, N, \delta)$ and for every $m > n$, $\varphi_N(f, m, \delta)$

holds for $t$-many $\delta$.

Proof. Let $\theta$ be $2 \cdot (\sup_{d>0} \theta_d) \cdot \theta_0$, and let $s$, $f$ be given. By the first assumption,
there is an $\alpha^{2 \cdot \sup_{d>0} \theta_d}$-subsequence $s'$ of $s$ and an $N$ such that $\varphi_0(f, N, \delta)$ holds
for $s'$-many $\delta$. Then there are an $\alpha^2$-subsequence $s''$ and an $n$ such that both
$\varphi_0(f, N, \delta)$ holds for $s''$-many $\delta$, and for each $m > n$, $\varphi_N(f, m, \delta)$ also holds for
$s''$-many $\delta$. Since $\alpha > \omega$, we may apply the method of Lemma 6.2 to obtain an
$\alpha$-subsequence $t$ such that the properties hold simultaneously for $t$-many $\delta$. □

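Recall that if $\mathcal{X}$ is a measure-preserving system and $\mathcal{Y}$ is a factor, $\mathcal{X} \times_{\mathcal{Y}} \mathcal{X}$ is again
a measure-preserving system with factor $\mathcal{Y}$. The space $L^2(\mathcal{X} \times_{\mathcal{Y}} \mathcal{X})$ and some of its
properties were described in Section 5. The operation of taking the relative square
over $\mathcal{Y}$ can be iterated; for each $r$ and $\delta$, we define the space $X_\delta^{[r]}$ by induction on
$r$, by setting $X_\delta^{[0]}$ equal to $X$, and $X_\delta^{[r+1]}$ equal to $X_\delta^{[r]} \times_{\mathcal{Y}_\delta} X_\delta^{[r]}$.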

Each space $L^2(X_\delta^{[r]})$ can be represented as described in Section 2. In particular,
$L^\infty(\mathcal{Y}_\delta)$ can be identified as a subset of $L^2(X_\delta^{[r]})$, and if $f$ and $g$ are elements of
$L^\infty(X_\delta^{[r]})$, then $f \otimes g$ is an element of $L^\infty(X_\delta^{[r+1]})$. Thus the most basic elements
of $L^2(X_\delta^{[r]})$ can be viewed as $2^r$-fold tensor products of elements of $L^\infty(X)$. We
define the simple elements of $L^2(X_\delta^{[r]})$ to be those that can be represented as finite
sums of such basic elements.

The advantage to focusing on simple elements is that if $f$ is such an element,
then $f$ can be viewed as an element of $L^\infty(X_\delta^{[r]})$ for each $\delta$, simultaneously. More
precisely, for each $r$, we define $L^\infty_0(r)$ to be the set of finite formal sums of such basic
elements; then each element $f$ of $L^\infty_0(r)$ denotes an element of $L^\infty(X_\delta^{[r]})$, for each $\delta$.
Note that if $f$ and $g$ are elements of $L^\infty_0(r)$ and $h$ is an element of $L^\infty(\mathcal{Y})$, it makes
sense to talk about $f + g$, $hf$, and $E(f \mid \mathcal{Y})$ as elements of $L^\infty_0(r)$. We may define
the bound of such a formal sum in the natural way, taking $\|\sum_{i<n} c_i f_i\|_{L^\infty}$ to
be $\sum_{i<n} |c_i| \, \|f_i\|_{L^\infty}$. Such a bound is an upper bound for the true bound in
$L^\infty(X_\delta^{[r]})$ for any $\delta$, and respects the usual properties of the $L^\infty$ norm with respect
to sums and products.
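The claim that the formal bound dominates the true sup norm can be checked on a toy model. The sketch below is only an illustration under stated assumptions: functions on a finite set are represented as lists of values, a basic element is a list of such factors (a tensor product evaluated on the full product space rather than on a relative product), and the bound of a formal sum is assumed to be the sum of $|c_i|$ times the product of the sup norms of the factors. The names `sup_norm`, `formal_bound`, and `true_sup_norm` are hypothetical and do not come from the paper.

```python
from itertools import product
from math import prod


def sup_norm(f):
    """Sup norm of a function given as a list of its values on a finite set."""
    return max(abs(v) for v in f)


def formal_bound(terms):
    """Assumed formal bound: sum of |c| times the product of factor sup norms.

    terms is a list of pairs (c, factors), where factors is a list of
    functions (lists of values), all of the same length and arity.
    """
    return sum(abs(c) * prod(sup_norm(f) for f in factors) for c, factors in terms)


def true_sup_norm(terms):
    """Sup norm of the function (x_1, ..., x_n) -> sum_i c_i * prod_j f_ij(x_j)."""
    arity = len(terms[0][1])
    size = len(terms[0][1][0])
    best = 0
    for point in product(range(size), repeat=arity):
        value = sum(
            c * prod(factors[j][point[j]] for j in range(arity))
            for c, factors in terms
        )
        best = max(best, abs(value))
    return best


if __name__ == "__main__":
    # Formal sum 1*(f1 (x) f2) - 2*(g1 (x) g2) on a two-point space.
    terms = [(1, [[1, -2], [3, 1]]), (-2, [[0, 1], [1, 1]])]
    print(true_sup_norm(terms) <= formal_bound(terms))
```

Since the sup norm over a relative product is at most the sup norm over the full product, the inequality in this toy model is consistent with the bound being uniform in $\delta$.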

The next lemma shows that for each $r$, we can find many $\delta$ such that the
space $X_\delta^{[r]}$ looks sufficiently weak mixing.

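Lemma 6.5. For every $\varepsilon > 0$, $B > 0$, and $r$ there is a $K < \omega$ such that for every
$\alpha^K$-sequence $s$ with $\alpha > \omega$ and every $f, g \in L^\infty_0(r)$ with $\|f\|_{L^\infty} < B$, $\|g\|_{L^\infty} < B$,
there are a natural number $n$ and an $\alpha$-subsequence $t$ of $s$ such that the property

for every $m > n$,

\[ \frac{1}{m} \sum_{i<m} \int \left[ E(f T^i g \mid \mathcal{Y}_\delta) - E(f \mid \mathcal{Y}_\delta) T^i E(g \mid \mathcal{Y}_\delta) \right]^2 d\mu(X_\delta^{[r]}) < \varepsilon \]

holds for $t$-many $\delta$.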

Proof. By induction on $r$. When $r = 0$, this is simply Lemma 5.1. Suppose the
claim holds for $r$. It suffices to consider the case where $f$ and $g$ in $L^\infty_0(r + 1)$ are of
the form $f = f_1 \otimes f_2$ and $g = g_1 \otimes g_2$, with $f_1, f_2, g_1, g_2$ in $L^\infty_0(r)$. Using Lemma 6.3
and the subadditivity of the left hand side, it suffices to consider the cases where
$E(f_i \mid \mathcal{Y}_\delta) = 0$ and where $E(f_i \mid \mathcal{Y}_\delta) = f_i$; the case where $E(f_i \mid \mathcal{Y}_\delta) = f_i$ for
both $i = 1$ and $i = 2$ is trivial, so we may further assume that for some $i \in \{1, 2\}$,
$E(f_i \mid \mathcal{Y}_\delta) = 0$.

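By the inductive hypothesis and Lemma 6.2, for any $\varepsilon' > 0$ we can find $K$ large
enough so that every $\alpha^K$-sequence $s$ has an $\alpha$-subsequence $t$ such that

\[ \frac{1}{m} \sum_{i<m} \int \left[ E(f_1 T^i g_1 \mid \mathcal{Y}_\delta) - E(f_1 \mid \mathcal{Y}_\delta) E(T^i g_1 \mid \mathcal{Y}_\delta) \right]^2 d\mu(X_\delta^{[r]}) < \varepsilon' \]

and

\[ \frac{1}{m} \sum_{i<m} \int \left[ E(f_2 T^i g_2 \mid \mathcal{Y}_\delta) - E(f_2 \mid \mathcal{Y}_\delta) E(T^i g_2 \mid \mathcal{Y}_\delta) \right]^2 d\mu(X_\delta^{[r]}) < \varepsilon' \]

for $t$-many $\delta$. But then, for such $\delta$,

\[ \frac{1}{m} \sum_{i<m} \int \left[ E((f_1 \otimes f_2)(T^i g_1 \otimes T^i g_2) \mid \mathcal{Y}_\delta) \right]^2 d\mu(X_\delta^{[r+1]}) = \frac{1}{m} \sum_{i<m} \int \left[ E(f_1 T^i g_1 \mid \mathcal{Y}_\delta) E(f_2 T^i g_2 \mid \mathcal{Y}_\delta) \right]^2 d\mu(X_\delta^{[r]}) \]

is close to

\[ \frac{1}{m} \sum_{i<m} \int \left[ E(f_1 \mid \mathcal{Y}_\delta) T^i E(g_1 \mid \mathcal{Y}_\delta) E(f_2 \mid \mathcal{Y}_\delta) T^i E(g_2 \mid \mathcal{Y}_\delta) \right]^2 d\mu(X_\delta^{[r]}), \]

which is $0$ since either $E(f_1 \mid \mathcal{Y}_\delta) = 0$ or $E(f_2 \mid \mathcal{Y}_\delta) = 0$. □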

From this point on, our proof follows that of [9, Theorem 8.3] very closely.

Lemma 6.6. Suppose that for every $\varepsilon > 0$, $B > 0$, $k$, and $r$ there is a $\theta < \omega^p$
such that for every $\alpha^\theta$-sequence $s$ with $\alpha > \omega$ and every $f_0, \ldots, f_{k-1}$ in $L^\infty_0(r)$ with
$\|f_l\|_{L^\infty} \le B$, there are a natural number $n$ and an $\alpha$-subsequence $t$ of $s$ such that
the property

for every $m > n$,

\[ \frac{1}{m} \sum_{i<m} \int \left( E\Big(\prod_{l=0}^{k-1} T^{li} f_l \,\Big|\, \mathcal{Y}_\delta\Big) - \prod_{l=0}^{k-1} T^{li} E(f_l \mid \mathcal{Y}_\delta) \right)^2 d\mu(X_\delta^{[r]}) < \varepsilon \]

holds for $t$-many $\delta$.

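Then for every $\varepsilon > 0$, $B > 0$, $k$, $r$ there is a $\theta < \omega^{(p+1)k}$ such that for every
$\alpha^\theta$-sequence $s$ with $\alpha > \omega$ and every $f_1, \ldots, f_k$ with $\|f_l\|_{L^\infty} \le B$, there are a natural
number $n$ and an $\alpha$-subsequence $t$ of $s$ such that the property

for every $m > n$,

\[ \left\| \frac{1}{m} \sum_{i<m} \left( \prod_{l=1}^{k} T^{li} f_l - \prod_{l=1}^{k} T^{li} E(f_l \mid \mathcal{Y}_\delta) \right) \right\|_{L^2(X_\delta^{[r]})} < \varepsilon \]

holds for $t$-many $\delta$.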



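Proof. Under the additional assumption that for some $l_0$, $E(f_{l_0} \mid \mathcal{Y}_\delta) = 0$, we will
prove the claim with $\theta < \omega^{p+1}$. Since

\[ \prod_{l=1}^{k} T^{li} f_l - \prod_{l=1}^{k} T^{li} E(f_l \mid \mathcal{Y}_\delta) = \sum_{j=1}^{k} \left( \prod_{l=1}^{j-1} T^{li} f_l \right) T^{ji} \big( f_j - E(f_j \mid \mathcal{Y}_\delta) \big) \left( \prod_{l=j+1}^{k} T^{li} E(f_l \mid \mathcal{Y}_\delta) \right), \]

we will then be able to apply Lemma 6.2 $k - 1$ times to obtain the full result with
the stated bound.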
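The telescoping decomposition of a difference of products used in this step can be sanity-checked on scalars. The following sketch is an illustration only: the hypothetical lists `a` and `b` stand in for the factors $T^{li} f_l$ and $T^{li} E(f_l \mid \mathcal{Y}_\delta)$ at a fixed $i$, and the function names are not taken from the paper.

```python
from math import prod


def telescoping_terms(a, b):
    """Return the k terms (a_1 ... a_{j-1}) * (a_j - b_j) * (b_{j+1} ... b_k).

    a and b are equal-length lists of numbers; the empty product is 1.
    """
    k = len(a)
    return [prod(a[:j]) * (a[j] - b[j]) * prod(b[j + 1:]) for j in range(k)]


def check(a, b):
    """Return (prod(a) - prod(b), sum of telescoping terms); the two should agree."""
    return prod(a) - prod(b), sum(telescoping_terms(a, b))


if __name__ == "__main__":
    lhs, rhs = check([2, 3, 5], [1, 4, 6])
    print(lhs == rhs)
```

Each term contains exactly one factor of the form $a_j - b_j$, which is the scalar counterpart of $f_j - E(f_j \mid \mathcal{Y}_\delta)$, a function whose conditional expectation vanishes. This is why the special case treated in the proof suffices.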

So assume that | 3^^) = 0. By Lemma l631 Lemma lOl and the assumption, 

we may choose a $\theta' < \omega^{p+1}$ so that for every $\alpha^{\theta'}$-sequence $s$ and every $f_1, \ldots, f_k$ with
$\|f_l\|_{L^\infty} < 1$, there are natural numbers $N$ and $H$ and an $\alpha$-subsequence $t$ of $s$ such
that for some $\varepsilon > 0$, chosen small enough for the argument below, the property



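\[ \frac{1}{H} \sum_{r=1-H}^{H-1} \int \left[ E(f_{l_0} T^{l_0 r} f_{l_0} \mid \mathcal{Y}_\delta) - E(f_{l_0} \mid \mathcal{Y}_\delta) T^{l_0 r} E(f_{l_0} \mid \mathcal{Y}_\delta) \right]^2 d\mu(X_\delta^{[r]}) < \varepsilon/k \]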

and for every m > N and |r| < H, 

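\[ \frac{1}{m} \sum_{i<m} \int \left[ E\Big(\prod_{l=1}^{k} T^{(l-1)i} (f_l T^{lr} f_l) \,\Big|\, \mathcal{Y}_\delta\Big) - \prod_{l=1}^{k} T^{(l-1)i} E(f_l T^{lr} f_l \mid \mathcal{Y}_\delta) \right]^2 d\mu(X_\delta^{[r]}) < \varepsilon/k \]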



holds for $t$-many $\delta$. It will suffice to argue that these two properties, at any $\delta$, imply
that for some $n$,

\[ \left\| \frac{1}{m} \sum_{i<m} \left( \prod_{l=1}^{k} T^{li} f_l - \prod_{l=1}^{k} T^{li} E(f_l \mid \mathcal{Y}_\delta) \right) \right\| < \varepsilon. \]

The necessary $n$ is $\max\{N, cH\}$ for some large constant $c$ depending on $\varepsilon$. Let
$m > n$ be given. Then, since $m$ is much larger than $H$, it suffices to show that the
properties above imply



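\[ \left\| \frac{1}{m} \sum_{i<m} \frac{1}{H} \sum_{h=i}^{i+H-1} \prod_{l=1}^{k} T^{lh} f_l \right\| \]

is small. By the convexity of $x \mapsto x^2$, it suffices to show that

\[ \frac{1}{m} \sum_{i<m} \left\| \frac{1}{H} \sum_{h=i}^{i+H-1} \prod_{l=1}^{k} T^{lh} f_l \right\|^2 \]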

is small. Expanding, this is bounded by 

i+H-l „ k 



1=1 



i<m h.h'=i 1=1 



But this may be rewritten as 



- y: 



r=l-H 



k 

-E /n^^^'-'^x/rt) 



Since we have chosen m> N, this is close to 



1 

H 



H-l 

E 

r=l-H 



1 



E 



which is bounded by 

H-l 



r=l-J? 



But we have chosen $H$ large enough that $\|E(f_{l_0} T^{l_0 r} f_{l_0} \mid \mathcal{Y}_\delta)\|$ is close to $0$ for
almost every $r$, and since the terms are bounded by $\prod_l \|f_l\|_{L^\infty}$, the average is small
as well. □



Lemma 6.7. Suppose that for every e > 0,B > 0,q,k, and r, there is a 6 < lj^ 
such that for every -sequence s with a> cj and every fi, . . . , fk in L^{2'^~^^) with 
II/HIl'" < B for each I < k, there are a natural number n and an a-subsequence t 
of s such that the property 
for every m>n, 



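\[ \left\| \frac{1}{m} \sum_{i<m} \left( \prod_{l=1}^{k} T^{li} f_l - \prod_{l=1}^{k} T^{li} E(f_l \mid \mathcal{Y}_\delta) \right) \right\| < \varepsilon \]

holds for $t$-many $\delta$.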

Further, suppose that for every s > 0, B > 0,q,k and r, there is a 9 < uj'^ such 
that for every -sequence s with a > uj and every fo, ■ ■ ■ , fk-i in L^{2^) with 
\\fi\\L°° < B for each I < k, there are a natural number n and an a-subsequence t 
of s such that the property 

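for every $m > n$,

\[ \frac{1}{m} \sum_{i<m} \int \left( E\Big(\prod_{l=0}^{k-1} T^{li} f_l \,\Big|\, \mathcal{Y}_\delta\Big) - \prod_{l=0}^{k-1} T^{li} E(f_l \mid \mathcal{Y}_\delta) \right)^2 d\mu < \varepsilon \]

holds for $t$-many $\delta$.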

Then for every e > 0,B > 0,q,k, and r, there is a 9 < such that for 

every -sequence s with a > oj and every /o, . . . , fk in L^{2'^) with ||/(||loo < B 
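for each $l \le k$, there are a natural number $n$ and an $\alpha$-subsequence $t$ of $s$ such that
the property

for every $m > n$,

\[ \frac{1}{m} \sum_{i<m} \int \left( E\Big(\prod_{l=0}^{k} T^{li} f_l \,\Big|\, \mathcal{Y}_\delta\Big) - \prod_{l=0}^{k} T^{li} E(f_l \mid \mathcal{Y}_\delta) \right)^2 d\mu < \varepsilon \]

holds for $t$-many $\delta$.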

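Proof. Once again, we apply Lemma 6.3 and subadditivity to reduce to the two
cases where $E(f_0 \mid \mathcal{Y}_\delta) = 0$ and where $E(f_0 \mid \mathcal{Y}_\delta) = f_0$.

In the former case, we may use the first hypothesis to choose witnesses so that

\[ \left\| \frac{1}{m} \sum_{i<m} \left( \prod_{l=1}^{k} T^{li} f_l - \prod_{l=1}^{k} T^{li} E(f_l \mid \mathcal{Y}_\delta) \right) \right\| < \varepsilon/2. \]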

Then it suffices to show 



k 

i<ml=l 

But by the choice of witnesses, the left hand side is within e of 

i<m 1 = 1 

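and since $E\big(\frac{1}{m} \sum_{i<m} \prod_{l=1}^{k} T^{li} E(f_l \mid \mathcal{Y}_\delta) \,\big|\, \mathcal{Y}_\delta\big) = \frac{1}{m} \sum_{i<m} \prod_{l=1}^{k} T^{li} E(f_l \mid \mathcal{Y}_\delta)$ and
$E(f_0 \mid \mathcal{Y}_\delta) = 0$, it follows that this expression is $0$.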

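In the latter case, we may use the second hypothesis to choose witnesses so that

\[ \frac{1}{m} \sum_{i<m} \int \left( E\Big(\prod_{l=0}^{k-1} T^{li} f_{l+1} \,\Big|\, \mathcal{Y}_\delta\Big) - \prod_{l=0}^{k-1} T^{li} E(f_{l+1} \mid \mathcal{Y}_\delta) \right)^2 d\mu < \varepsilon. \]

Then the left hand side of the desired conclusion is bounded by

\[ \|f_0\|^2_{L^\infty} \, \frac{1}{m} \sum_{i<m} \int \left( E\Big(\prod_{l=1}^{k} T^{li} f_l \,\Big|\, \mathcal{Y}_\delta\Big) - \prod_{l=1}^{k} T^{li} E(f_l \mid \mathcal{Y}_\delta) \right)^2 d\mu, \]

and shifting each term by $T^{-i}$, this is equal to

\[ \|f_0\|^2_{L^\infty} \, \frac{1}{m} \sum_{i<m} \int \left[ E\Big(\prod_{l=0}^{k-1} T^{li} f_{l+1} \,\Big|\, \mathcal{Y}_\delta\Big) - \prod_{l=0}^{k-1} T^{li} E(f_{l+1} \mid \mathcal{Y}_\delta) \right]^2 d\mu, \]

which is less than $\varepsilon$. □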

Lemma 6.8. (1) For every e > 0, B > 0, and k, there is a 6 < uj^^'' such 
that for every -sequence s with a > oj and every /q, . . . in L°°{X) 
with \\fi\\L°° < B for each I < k, there are a natural number n and an 
a-subsequence t of s such that the property 
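for every $m > n$,

\[ \frac{1}{m} \sum_{i<m} \int \left( E\Big(\prod_{l=0}^{k} T^{li} f_l \,\Big|\, \mathcal{Y}_\delta\Big) - \prod_{l=0}^{k} T^{li} E(f_l \mid \mathcal{Y}_\delta) \right)^2 d\mu(X) < \varepsilon \]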

holds for t-many 5. 
(2) For every e > 0, B > 0, and k, there is a 6 < uj'^ such that for every - 
sequence s with a > oj and every /i, . . . G L°°{X^ ) with ||//||l=° < B 
for each I < k, there are a natural number n and an a-subsequence t of s 
such that the property 
for every m> n, 

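\[ \left\| \frac{1}{m} \sum_{i<m} \left( \prod_{l=1}^{k} T^{li} f_l - \prod_{l=1}^{k} T^{li} E(f_l \mid \mathcal{Y}_\delta) \right) \right\| < \varepsilon \]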

holds for t-many S. 

Proof. We will prove the stronger claim that these hold with any $X_\delta^{[r]}$ in place of
$X$ and $L^\infty_0(r)$ in place of $L^\infty(X)$, simultaneously by induction on $k$. For $k = 1$, (1)
is Lemma 6.5 and (2) is trivial. Given (1) for $k$, (2) for $k + 1$ follows by Lemma
6.6. Given (2) for $k + 1$ and (1) for $k$, (1) for $k + 1$ follows by Lemma 6.7. □

Theorem 3.1 and Corollary 3.2 follow by taking $s$ to be the $\alpha^\theta$-sequence with
$s_\beta = \beta$ for every $\beta < \alpha^\theta$.

7. Logical issues 

We now turn to a discussion of the logical methods behind the results just
obtained. This paper is part of a broader effort to understand the methods
of ergodic theory and ergodic Ramsey theory in more explicit computational or
combinatorial terms [I], using a body of logical techniques that fall under the
heading "proof mining" (see [T^[T3], as well as [3, Section 6]). In particular, the
results here were obtained by employing a systematic rewriting of the
Furstenberg-Katznelson proof [H [71 |9], based on Gödel's Dialectica functional
interpretation [TTl [2]. Here we provide a "rational reconstruction" of the methods
we used.

The first step was to rewrite the key definitions and lemmas in the
Furstenberg-Katznelson proof in a way that makes the logical structure of the
assertions clear, and, in particular, distinguishes quantification over ordinals from
quantification over integers and other objects that have a finitary representation.
Limits and projections involving the maximal distal factor, $\mathcal{Y}$, were expressed directly
in terms of the hierarchy $(\mathcal{Y}_\alpha)$. For example, the assertion that the projection
$E(f \mid \mathcal{Y})$ is within $\varepsilon$ of $g$ can be expressed as
$\exists \alpha\, \forall \beta > \alpha\ \|E(f \mid \mathcal{Y}_\beta) - g\| < \varepsilon$, which asserts
that there is a level $\alpha$ beyond which the projection stays within $\varepsilon$ of $g$. But it can
also be expressed as $\forall \alpha\, \exists \beta > \alpha\ (\|E(f \mid \mathcal{Y}_\beta) - g\| < \varepsilon)$, which asserts that there are
arbitrarily large levels $\beta$ at which the projection is within $\varepsilon$ of $g$. The statement
that the sequence $A_n(f \otimes f)$ converges in $\mathcal{X} \times_{\mathcal{Y}} \mathcal{X}$ can then be expressed as follows:

\[ \forall \varepsilon > 0\ \exists n\ \forall m > n, \alpha\ \exists \beta > \alpha\ \big( \|A_m(f \otimes f) - A_n(f \otimes f)\|_{L^2(\mathcal{X} \times_{\mathcal{Y}_\beta} \mathcal{X})} < \varepsilon \big). \tag{2} \]

Other statements central to the proof were analyzed in similar ways.

The proof of the mean ergodic theorem is not constructive [2, 11], and, in general,
one cannot extract bounds on $\beta$ in (2). The next step was therefore to seek a
"quasi-constructive" interpretation of the proof which yields more explicit ordinal
bounds. To that end, we employed a functional interpretation roughly along the
lines of the one described in [?] (which is, in turn, related to a similar interpretation
due to Feferman, described in [5, Section 9.3]). For example, in (2), the dependence
of $\beta$ on $m$ can be eliminated by choosing a $\beta_m$ for each $m$, and then taking the
supremum:

\[ \forall \varepsilon > 0\ \exists n\ \forall \alpha\ \exists \beta\ \big( \beta > \alpha \wedge \forall m > n\ \|A_m(f \otimes f) - A_n(f \otimes f)\|_{L^2(\mathcal{X} \times_{\mathcal{Y}_\beta} \mathcal{X})} < \varepsilon \big). \]
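We can then make the dependence of $\beta$ on $\alpha$ explicit:

\[ \forall \varepsilon > 0\ \exists n, \beta\ \forall \alpha\ \big( \beta(\alpha) > \alpha \wedge \forall m > n\ \|A_m(f \otimes f) - A_n(f \otimes f)\|_{L^2(\mathcal{X} \times_{\mathcal{Y}_{\beta(\alpha)}} \mathcal{X})} < \varepsilon \big). \tag{3} \]

It is still impossible to obtain an explicit description of $\beta$, but the Dialectica
interpretation involves one final move. If (3) were false, then for some fixed $\varepsilon > 0$,
there would be a function $\alpha(n, \beta)$ that provided a counterexample for each $n$ and
$\beta$. Thus (3) is equivalent to the assertion that there is no such counterexample:

\[ \forall \varepsilon > 0, \alpha\ \exists n, \beta\ \big( \beta(\alpha(n, \beta)) > \alpha(n, \beta) \wedge \forall m > n\ \|A_m(f \otimes f) - A_n(f \otimes f)\|_{L^2(\mathcal{X} \times_{\mathcal{Y}_{\beta(\alpha(n, \beta))}} \mathcal{X})} < \varepsilon \big). \tag{4} \]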

The logical methods now make it possible to extract an explicit description of the
function $\beta$ that "foils" the purported counterexample $\alpha$. Informally, one obtains an
algorithm for $\beta$ which involves relatively explicit operations with ordinals, such as
taking maxima and suprema; application and iterations of functions; and possibly
noncomputable functions on the integers. (The fact that transfinite induction is
not used in the proof of the mean ergodic theorem for $\mathcal{X} \times_{\mathcal{Y}} \mathcal{X}$ translates to the fact
that there are no transfinite recursions in the algorithm. Allowing noncomputable
functions on the integers allows us to ignore, for example, the universal quantifier
over $m$ in (3), and restrict focus to the parts of the informal proof that bear on the
ordinal bounds.) More formally, one obtains a term in the calculus denoted in
[3], involving only the operations just mentioned.
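The move from a convergence statement to a "no counterexample" statement has a familiar analogue over the integers, which can be sketched in code. The example below is an illustration under simplifying assumptions and is not the ordinal-valued algorithm extracted in the paper: a nondecreasing sequence `a` with values in [0, 1] replaces the tower of factors, and a function `F` with `F(n) >= n` plays the role of the purported counterexample function. The name `metastable_point` is hypothetical.

```python
import math


def metastable_point(a, F, eps):
    """Find n with a(F(n)) - a(n) < eps by iterating n -> F(n) from n = 0.

    Assumes a is nondecreasing with values in [0, 1] and F(n) >= n.
    If every one of floor(1/eps) + 1 candidates failed, the values of a
    would increase by more than 1 in total, which is impossible.
    """
    n = 0
    for _ in range(math.floor(1 / eps) + 1):
        if a(F(n)) - a(n) < eps:
            return n
        n = F(n)
    raise ValueError("hypotheses violated: a is not nondecreasing in [0, 1]")


if __name__ == "__main__":
    a = lambda n: 1 - 1 / (n + 1)   # nondecreasing, values in [0, 1)
    F = lambda n: 2 * n + 1         # candidate counterexample function
    print(metastable_point(a, F, 0.1))
```

Because `a` is nondecreasing, `a(F(n)) - a(n) < eps` implies that `a` varies by less than `eps` on the whole interval from `n` to `F(n)`. The search uses only iteration of `F` and a bound depending on `eps`, which mirrors, in miniature, the way the extracted algorithm uses application and iteration of functions to foil a counterexample.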

In the final result, Theorem 3.1, there is only an existential quantifier over
ordinals. Methods of Tait [18] (see also [2, Section 4.4]) suggest that the explicit
witnessing term extracted from the proof should be bounded below the ordinal
$\varepsilon_0$, which is the limit of the ordinals $\omega, \omega^\omega, \omega^{\omega^\omega}, \ldots$. The final step of our analysis
was to seek a more direct route to obtain such a conclusion, both to improve the
bound and avoid relying on metamathematical considerations. For example, if one
is interested in bounds rather than explicit witnesses in (4), one can assume that
the function $\beta$ is increasing and continuous. Given any such function, $\beta$, there are
unboundedly many ordinals $\gamma$ that are closed under $\beta$. Inspection of the translated
proof of ^ showed that it was possible to think of the counterexample function, $\alpha$,
as taking such a sequence of closure ordinals, and returning a sequence of bounds on
counterexamples; the proof showed that the original sequence could be thinned to
obtain a subsequence along which $\alpha$ fails. Once the decision was made to cast the
central results in those terms, it was fairly easy to describe the algorithms extracted
by the functional interpretation in that way.

The analysis not only yields the additional information provided by Theorem 3.1,
but also shows that the argument does not use the full axiomatic strength needed to
carry out the transfinite iteration. The transfinite construction of the
Furstenberg-Zimmer structure theorem requires an impredicative theory, like $ID_1$ or
$\Pi^1_1$-CA, which is, from a proof-theoretic standpoint, quite strong; in contrast, the
construction of the hierarchy up to stage $\omega^{\omega^\omega}$ requires only a principle of iterated
arithmetic comprehension along that ordinal, which can be obtained, for example, in
the predicative theory S}~CA. See [U O [17] for more information about the relevant
theories.

It is interesting to note, however, that the logical considerations drop out of 
the final results. The metamathematical results provide a deeper understanding 
of the role that strong nonconstructive principles play in ordinary mathematical 
reasoning, and provide a guide to interpreting particular mathematical proofs in 
more explicit terms. But if one is only interested in the latter, at the end of the 
day, one is left with a purely mathematical proof. 